Conversation
| message TimeSkippingConfig { | ||
|
|
||
| // If set, enables automatic time-skipping for this workflow execution. | ||
| // It can also be disabled by setting this field to false. |
There was a problem hiding this comment.
@Sushisource Hey! I've cleaned this pr and ready for review.
And this config may need special attention since it is changed significantly from previous version Chad proposed.
(1)deleted most of the fine-grained control (like max timer-firing count, max duration to skip) from the previous version. The reason is it is ambiguous when complicated configuration is propagated to child-workflows.
(2) I have also changed the nested config to one single layer because this configuration is only used for automatic time-skipping, manual time-skipping will have a separate request and response payload
(3) we allow disabling time-skipping for in-flight workflows since it is not hard for the server to have this feature, and it will be a nice-to-have feature same as the SDK TestServer.
I think we can consider wrapping up this pr early, and put manual time-skipping in a separate pr maybe.
Pls feel free to disagree, happy to refine.
There was a problem hiding this comment.
I think this looks totally reasonable - but I also don't see any reason to rush the PR while work is in progress unless we intend to ship a version without manual skipping first. If we don't, let's use the opportunity to develop it and decide if we want to make any changes to this config that relate to manual timeskipping.
83f6639 to
e1e1c03
Compare
| message TimeSkippingConfig { | ||
|
|
||
| // If set, enables automatic time-skipping for this workflow execution. | ||
| // It can also be disabled by setting this field to false. |
There was a problem hiding this comment.
I think this looks totally reasonable - but I also don't see any reason to rush the PR while work is in progress unless we intend to ship a version without manual skipping first. If we don't, let's use the opportunity to develop it and decide if we want to make any changes to this config that relate to manual timeskipping.
4aeceb6 to
094e2c5
Compare
9721ff7 to
c686992
Compare
| // An event that indicates that the previously paused workflow execution has been unpaused. | ||
| EVENT_TYPE_WORKFLOW_EXECUTION_UNPAUSED = 59; | ||
| // An event that indicates that some duration was skipped for this workflow execution. | ||
| EVENT_TYPE_WORKFLOW_EXECUTION_TIME_SKIPPED = 60; |
There was a problem hiding this comment.
You know this just occurred to me and I don't think we've talked about it - if time skipping is enabled on a workflow with an old SDK version processing it, it's going to break on these new event types.
That's fine I think, we will tell people they need to upgrade, but we need to make sure that is clear in the docs & GTM material we create.
There was a problem hiding this comment.
get it. the docs & GTM material mainly refer to API comments (here) and https://docs.temporal.io/ ?
There was a problem hiding this comment.
if we always set worker_may_ignore for TimeSkippingEvents, it won't break workers right? because in the demo, I set worker_may_ignore = true and used current SDK which has no time-skipping feature and didnt break it
There was a problem hiding this comment.
I think for sdk version old enough, it doesn't even know about the worker_may_ignore flag, hence the comment. but yeah this is a new feature so we can make it clear re. the sdk version required.
| // when true, it means the automatic time-skipping was enabled with a bound | ||
| // and now the bound is reached and the time-skipping is disabled automatically. | ||
| bool bound_reached_and_time_skipping_disabled = 2; |
There was a problem hiding this comment.
| // when true, it means the automatic time-skipping was enabled with a bound | |
| // and now the bound is reached and the time-skipping is disabled automatically. | |
| bool bound_reached_and_time_skipping_disabled = 2; | |
| // when true, automatic time-skipping was enabled with a bound | |
| // and now that bound has been reached and time-skipping is disabled automatically. | |
| bool disabled_after_bound = 2; |
This is my best idea for a better name. The comment makes things clear, so I don't think we need a super wordy field name.
c686992 to
837c381
Compare
837c381 to
012f767
Compare
| // i.e. the existing time-skipping conf will be preserved. | ||
| TimeSkippingConfig time_skipping_config = 3; | ||
| } | ||
|
|
There was a problem hiding this comment.
@Sushisource @yycptt updated this part especially the bound part according to discussions. With the new fine-grained bound, I think the 1h-signal can be easily implemented by setting max_elapsed_duration to 1hour or max_target_time to some specific time in 1h hour.
For server, a EVENT_TYPE_WORKFLOW_EXECUTION_TIME_SKIPPED will be created to indicate time-skipping is disabled by bound even if no duration skipped.
For SDK can poll this event, but I think there are two things to consider right now
- we may need to add a version/ID for the time-skipping config if the user resets this config multiple times so the poller can differentiate EVENT_TYPE_WORKFLOW_EXECUTION_TIME_SKIPPED easily
- right now there is one get-api to read history events, it may be easier if we build a dedicated polling api (but this one may be added later?
There was a problem hiding this comment.
I think for me the confusing part is those two options didn't say it will, like, create a new timer at the specified bound. and I am under the impression that time skipping will only skip to existing timers within the bound.
There was a problem hiding this comment.
yes it only skip timers, but we allow this to be turned off automatically without finding anything to skip
do you have recommendations how to change ?
| string request_id = 3; | ||
| } | ||
|
|
||
| // Attributes for an event indicating that virtual time was advanced for a workflow execution, |
There was a problem hiding this comment.
⭐ highlight the change after review meeting
There was a problem hiding this comment.
We should also add to the comment here that worker_may_ignore should always be set true for this event
api/temporal/api/history/v1/message.proto
Line 1127 in 2547d30
There was a problem hiding this comment.
added but I remember an early proposal was to break the workers and remind them to upgrade. So I guess our strategy changed?
There was a problem hiding this comment.
Sorry, I meant we should add this to the comment here, not on the field. Otherwise that field will have an ever-increasing list in the comment.
And I think that was before we realized workers don't need to handle this at all. Even old SDKs will be able to successfully replay histories with time-skipping, which could be useful if users want to try this without needing to upgrade SDKs.
There was a problem hiding this comment.
I guess I located here correctly this time?
| // This bound is not propagated to child workflows. | ||
| // It is recommended to set disable_propagation to true | ||
| // and configure TimeSkippingConfig explicitly in child workflows. | ||
| oneof bound { |
There was a problem hiding this comment.
⭐ highlight the change after review meeting
There was a problem hiding this comment.
looks great!
max_elapsed_duration may be a bit hard to use due to real time part of it.
For the timer + signal case we talked about, which option should caller use? max_elapsed_duration?
sorry I see you answered this question in the other comment.
| string request_id = 3; | ||
| } | ||
|
|
||
| // Attributes for an event indicating that virtual time was advanced for a workflow execution, |
There was a problem hiding this comment.
We should also add to the comment here that worker_may_ignore should always be set true for this event
api/temporal/api/history/v1/message.proto
Line 1127 in 2547d30
| // An event that indicates that the previously paused workflow execution has been unpaused. | ||
| EVENT_TYPE_WORKFLOW_EXECUTION_UNPAUSED = 59; | ||
| // An event that indicates that some duration was skipped for this workflow execution. | ||
| EVENT_TYPE_WORKFLOW_EXECUTION_TIME_SKIPPED = 60; |
There was a problem hiding this comment.
I think for sdk version old enough, it doesn't even know about the worker_may_ignore flag, hence the comment. but yeah this is a new feature so we can make it clear re. the sdk version required.
| message WorkflowExecutionTimeSkippedEventAttributes { | ||
|
|
||
| // The virtual time after time skipping is applied. | ||
| google.protobuf.Timestamp target_time = 1; |
There was a problem hiding this comment.
The timestamp of this event will be the workflow (virtual) time before or after the skip? I guess it's before? and the difference between the event time and the target time is the duration skipped?
There was a problem hiding this comment.
emm this is after, right now at sometime for example 4:00 virtual time (which is 3:00 in real time) we are going to skip 1hour to 5:00 and the target_time is 5:00
| bool disabled_after_bound = 2; | ||
|
|
||
| // The wall-clock time when the time-skipping event happened. | ||
| google.protobuf.Timestamp wall_clock_time = 3; |
There was a problem hiding this comment.
I am trying to understand this field v.s. the event time of the event. I mean I understand that this is always wal clock and event time can be a virtual time, but want to see if you any use case where this field will help make it easier.
There was a problem hiding this comment.
this was discussed in the review-meeting only for better visibility for debugging purpose. for example, I am an user, and I am debugging something with my business system built on temporal, now I need to align temporal history (which is virtual time) with my own log (which is real time)
definitely, it will be better but heavier if we add all wallclock to all following events
| // If set, the enabled field is not propagated to child workflows. | ||
| bool disable_propagation = 2; |
There was a problem hiding this comment.
should clarify if this will propagate via continue as new or reset as well.
There was a problem hiding this comment.
I am thinking we make it simple now: we don't propagate bound and propagate enable for all these three cases. And this feature is not supported for nexus operation so no propagation for now. make sense?
|
|
||
| message DeleteActivityExecutionResponse { | ||
| } | ||
| } No newline at end of file |
There was a problem hiding this comment.
emm it seems not to be a new line >?
| // Configuration for automatic time skipping during a workflow execution. | ||
| // When enabled, virtual time advances automatically whenever there are | ||
| // no in-flight activities, child workflows, or Nexus operations. | ||
| message TimeSkippingConfig { |
There was a problem hiding this comment.
do we need to add this in to child workflow command?
There was a problem hiding this comment.
Specifically, I don't imagine we will need to set childWorkflow time-skipping config thru command when users use SDK testing env. And in general, I think we may keep this as complex as the current version and only consider adding more when SDK really needs it.
| // Configuration for automatic time skipping during a workflow execution. | ||
| // When enabled, virtual time advances automatically whenever there are | ||
| // no in-flight activities, child workflows, or Nexus operations. | ||
| message TimeSkippingConfig { |
There was a problem hiding this comment.
nit: Wonder if we need to call this AutoTimeSkippingConfig in case we add manual skipping support in the future.
There was a problem hiding this comment.
I am kind of yes and no for different reasons. Guts-feeling is we will not get there that adding manual skipping.
what do you think @Sushisource ?
There was a problem hiding this comment.
On second thought, I’m leaning against renaming it. In my mind, 'TimeSkipping' implies an automatic process. Let’s keep it simple and only add a 'Manual' prefix when we need to distinguish the manual version?
| // This bound is not propagated to child workflows. | ||
| // It is recommended to set disable_propagation to true | ||
| // and configure TimeSkippingConfig explicitly in child workflows. | ||
| oneof bound { |
There was a problem hiding this comment.
looks great!
max_elapsed_duration may be a bit hard to use due to real time part of it.
For the timer + signal case we talked about, which option should caller use? max_elapsed_duration?
sorry I see you answered this question in the other comment.
|
|
||
| // Configuration for automatic time skipping during a workflow execution. | ||
| // When enabled, virtual time advances automatically whenever there are | ||
| // no in-flight activities, child workflows, or Nexus operations. |
There was a problem hiding this comment.
I wonder if we need to block auto skipping when there's pending signal/cancel external workflow operations in the transfer queue as well.
There was a problem hiding this comment.
I think these are good scenarios for testing, I think I can record these in testing planning.
For desgin,
(1) last time I learned signal/update, they come with workflow tasks so covered by workflow tasks ?
(2) for cancel task in transfer queue, the ms state becomes completed no? and I think our poc already check this state together with isAutoSkippable
pls feel free to correct me if I misunderstand how the sys works right now

What changed?
new feature: automatic time-skipping configuration and new event type added
Breaking changes
new event type added but right now no workflow history will have this new event type
Server PR
not expect to break server, no new rpc-handler and new fields are all optional